natural parameter space
How and Why to Manipulate Your Own Agent
This paper deals with a common type of scenario: several users engage in a strategic online interaction, each assisted by a learning agent. A typical example is advertisers competing for advertising slots on some platform. Each advertiser enters their key parameters into an advertiser-facing website, and this website's "agent" then participates on the advertiser's behalf in a sequence of auctions for ad slots. Often, the platform designer provides this agent as its advertiser-facing user interface. In cases where the platform's agent does not optimize sufficiently well for the advertiser (but rather, say, for the auctioneer), one would expect some other company to provide an agent that is better for the advertiser.
The statistical Minkowski distances: Closed-form formula for Gaussian Mixture Models
The traditional Minkowski distances are induced by the corresponding Minkowski norms in real-valued vector spaces. In this work, we propose novel statistical symmetric distances based on Minkowski's inequality for probability densities belonging to Lebesgue spaces. These statistical Minkowski distances admit closed-form formulas for Gaussian mixture models when parameterized by integer exponents. This result extends to arbitrary mixtures of exponential families whose natural parameter spaces are cones: this includes binomial, multinomial, zero-centered Laplacian, Gaussian, and Wishart mixtures, among others. We also derive a Minkowski diversity index of a normalized weighted set of probability distributions from Minkowski's inequality.
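The computational core here is that for an integer exponent k, the Lebesgue norm ||m||_k of a Gaussian mixture m has a closed form: expanding m^k by the multinomial theorem leaves only integrals of products of Gaussian densities, and any such product is an unnormalized Gaussian. The Python sketch below illustrates this; it is not the paper's code, and the symmetric gap shown is one natural candidate distance suggested by Minkowski's inequality, which may differ in detail from the paper's definition.

```python
import itertools
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def gauss_power_product_integral(mus, vs, counts):
    """Integral over R of prod_i N(x; mu_i, v_i)^{a_i}: a product of
    Gaussian densities is an unnormalized Gaussian, so complete the square."""
    a = np.asarray(counts, float)
    mus, vs = np.asarray(mus, float), np.asarray(vs, float)
    logC = -0.5 * np.sum(a * np.log(2 * np.pi * vs))
    A = np.sum(a / vs)                     # total precision
    B = np.sum(a * mus / vs)
    D = np.sum(a * mus ** 2 / vs)
    return np.exp(logC + 0.5 * (np.log(2 * np.pi / A) + B ** 2 / A - D))

def gmm_lk_norm(weights, mus, vs, k):
    """||m||_k for m(x) = sum_i w_i N(x; mu_i, v_i) and integer k >= 1,
    by expanding m^k over ordered k-tuples of component indices."""
    total = 0.0
    for idx in itertools.product(range(len(weights)), repeat=k):
        wprod = np.prod([weights[i] for i in idx])
        counts = np.bincount(np.array(idx), minlength=len(weights))
        total += wprod * gauss_power_product_integral(mus, vs, counts)
    return total ** (1.0 / k)

def minkowski_gap(p, q, k):
    """Symmetric nonnegative gap ||p||_k + ||q||_k - ||p + q||_k; by
    Minkowski's inequality it vanishes iff p and q are proportional,
    hence iff p = q when both are densities."""
    joint = tuple(list(a) + list(b) for a, b in zip(p, q))
    return gmm_lk_norm(*p, k) + gmm_lk_norm(*q, k) - gmm_lk_norm(*joint, k)

# Sanity check against numerical quadrature for k = 3.
p = ([0.6, 0.4], [0.0, 2.0], [1.0, 0.5])
dens = lambda x: sum(w * norm.pdf(x, m, np.sqrt(v)) for w, m, v in zip(*p))
closed = gmm_lk_norm(*p, 3)
numeric = quad(lambda x: dens(x) ** 3, -20, 20)[0] ** (1 / 3)
print(closed, numeric)                     # should agree to quad precision
print(minkowski_gap(p, ([1.0], [1.0], [1.0]), 3))
```

The same expansion works for any exponential family whose natural parameter space is a cone, since sums of natural parameters then stay inside the space and the product of densities remains in the family up to normalization.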
A convex program for bilinear inversion of sparse vectors
Aghasi, Alireza, Ahmed, Ali, Hand, Paul, Joshi, Babhru
We consider the bilinear inverse problem of recovering two vectors, x in R^L and w in R^L, from their entrywise product. We consider the case where x and w have known signs and are sparse with respect to known dictionaries of size K and N, respectively. Here, K and N may be larger than, smaller than, or equal to L. We introduce L1-BranchHull, a convex program posed in the natural parameter space that does not require an approximate solution or initialization in order to be stated or solved. We study the case where x and w are S1- and S2-sparse with respect to a random dictionary, with the sparse vectors satisfying an effective sparsity condition, and present a recovery guarantee that depends on the number of measurements as L > Omega((S1 + S2) log^2(K + N)). Numerical experiments verify that the scaling constant in the theorem is not too large. One application of this problem is the sweep distortion removal task in dielectric imaging, where one of the signals is a nonnegative reflectivity and the other lives in a known subspace, for example the one spanned by dominant wavelet coefficients. We also introduce variants of L1-BranchHull for tolerating noise and outliers and for recovering piecewise constant signals. We provide an ADMM implementation of these variants and show they can extract piecewise constant behavior from real images.
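The convexity that makes such a program possible is easy to state: with known signs, each bilinear measurement constrains the sign-corrected factor pair (p_l, q_l) to the set {p q >= |y_l|, p >= 0, q >= 0}, a single branch of a hyperbola that is second-order-cone representable. The cvxpy sketch below is a plausible rendering of an L1-BranchHull-style program under those assumptions, not the authors' reference implementation; the dictionaries B and C, the variable names, and the per-measurement loop are all illustrative.

```python
import cvxpy as cp
import numpy as np

def branchhull_style(B, C, y, sx, sw):
    """Sketch: recover h (K-dim) and g (N-dim) with x = B h, w = C g,
    given y = x * w entrywise and the entrywise signs sx, sw of x, w.
    Each constraint p_l * q_l >= |y_l| with p_l, q_l >= 0 is convex and
    is imposed through a geometric-mean (hyperbolic cone) constraint."""
    L, K = B.shape
    N = C.shape[1]
    h = cp.Variable(K)
    g = cp.Variable(N)
    p = cp.multiply(sx, B @ h)      # sign-corrected factors, kept >= 0
    q = cp.multiply(sw, C @ g)
    cons = [p >= 0, q >= 0]
    for l in range(L):              # loop kept for clarity, not speed
        cons.append(cp.geo_mean(cp.hstack([p[l], q[l]])) >= np.sqrt(abs(y[l])))
    cp.Problem(cp.Minimize(cp.norm1(h) + cp.norm1(g)), cons).solve()
    return h.value, g.value

# Synthetic demo: sparse ground truth against Gaussian dictionaries.
rng = np.random.default_rng(0)
L, K, N = 120, 30, 30
B = rng.standard_normal((L, K))
C = rng.standard_normal((L, N))
h0 = np.zeros(K); h0[:3] = rng.standard_normal(3)
g0 = np.zeros(N); g0[:3] = rng.standard_normal(3)
x0, w0 = B @ h0, C @ g0
h_hat, g_hat = branchhull_style(B, C, x0 * w0, np.sign(x0), np.sign(w0))
```

The l1 objective serves two roles at once: it promotes sparsity and it pins down the scaling ambiguity (cx, w/c) that is inherent to any bilinear measurement model.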
An Elementary Proof of Convex Phase Retrieval in the Natural Parameter Space via the Linear Program PhaseMax / Compressed Sensing from Phaseless Gaussian Measurements via Linear Programming in the Natural Parameter Space
The phase retrieval problem has garnered significant attention since the development of the PhaseLift algorithm, a convex program that operates in a lifted space of matrices. Because lifting incurs substantial computational cost, many alternative approaches to phase retrieval have been developed, including non-convex optimization algorithms that operate in the natural parameter space, such as Wirtinger Flow. Recently, a convex formulation called PhaseMax was discovered and proven to achieve phase retrieval via linear programming in the natural parameter space under optimal sample complexity. The existing proofs for PhaseMax rely on statistical learning theory or geometric probability theory. Here, we present a short and elementary proof that PhaseMax exactly recovers real-valued vectors from random measurements under optimal sample complexity.
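In the real case, the PhaseMax program referenced above is concrete enough to state in a few lines: given phaseless measurements b = |A x0| and an anchor vector positively correlated with x0, it maximizes the inner product with the anchor subject to |A x| <= b, which is a linear program. A minimal scipy sketch follows; the noisy-copy anchor is a stand-in for whatever initializer supplies it in practice.

```python
import numpy as np
from scipy.optimize import linprog

def phasemax_lp(A, b, anchor):
    """PhaseMax for real signals: maximize <anchor, x> subject to
    |A x| <= b, written as the LP  min -anchor.x  s.t.  A x <= b, -A x <= b."""
    n = A.shape[1]
    res = linprog(c=-anchor,
                  A_ub=np.vstack([A, -A]),
                  b_ub=np.concatenate([b, b]),
                  bounds=[(None, None)] * n)
    return res.x

rng = np.random.default_rng(1)
m, n = 400, 40
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)
b = np.abs(A @ x0)                           # phaseless measurements
anchor = x0 + 0.8 * rng.standard_normal(n)   # stand-in initializer
x_hat = phasemax_lp(A, b, anchor)
print(np.linalg.norm(x_hat - x0))            # near zero at this oversampling
```

Note that the feasible set is symmetric under x -> -x; it is the anchor in the objective that breaks the global sign ambiguity and selects +x0.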
Higher-order asymptotics for the parametric complexity
The minimum description length (MDL) principle provides a general information-theoretic approach to model selection and other forms of statistical inference [5, 17]. The MDL criterion for model selection is consistent, meaning that it will select the data-generating model from a countable set of competing parametric models with probability approaching 1 as the sample size n goes to infinity [4]. For example, if each of the parametric models is a logistic regression model with predictor variables taken from a fixed set of potential predictors, then the MDL model-selection criterion will choose the correct combination of predictors with probability approaching 1 as n → ∞. The MDL model-selection criterion also has a number of strong optimality properties, which greatly extend Shannon's noiseless coding theorem [5, §III.E]. In its simplest form, the MDL principle advocates choosing the model for which the observed data has the shortest message length under a particular prefix code defined by a minimax condition [11, §2.4.3]. Shtarkov [19] showed that this is equivalent to choosing the model with the largest normalized maximum likelihood (NML) for the observed data.
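To make the NML construction concrete, consider the Bernoulli model (an illustration, not an example taken from the paper): the Shtarkov normalizer is the sum over all possible samples of the maximized likelihood, which collapses to a sum over success counts, and its logarithm is the parametric complexity whose higher-order asymptotics are the subject here. A small sketch, with the standard leading-order expansion (d/2) log(n/2π) + log ∫ sqrt(det I(θ)) dθ shown for comparison:

```python
import numpy as np
from scipy.special import gammaln, logsumexp

def bernoulli_parametric_complexity(n):
    """log Shtarkov sum for the Bernoulli model with n observations:
    log sum_{k=0}^{n} C(n, k) (k/n)^k (1 - k/n)^(n-k),  with 0^0 := 1."""
    k = np.arange(n + 1)
    log_binom = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(k == 0, 0.0, k * np.log(k / n))
        t2 = np.where(k == n, 0.0, (n - k) * np.log(1 - k / n))
    return logsumexp(log_binom + t1 + t2)

n = 1000
exact = bernoulli_parametric_complexity(n)
# Leading-order asymptotics: (1/2) log(n / (2 pi)) + log \int sqrt(I(p)) dp,
# with I(p) = 1 / (p (1 - p)), so the integral over (0, 1) equals pi.
approx = 0.5 * np.log(n / (2 * np.pi)) + np.log(np.pi)
print(exact, approx)   # the gap is what higher-order terms account for
```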
Volumes of logistic regression models with applications to model selection
Logistic regression models with $n$ observations and $q$ linearly-independent covariates are shown to have Fisher information volumes which are bounded below by $\pi^q$ and above by ${n \choose q} \pi^q$. This is proved with a novel generalization of the classical theorems of Pythagoras and de Gua, which is of independent interest. The finding that the volume is always finite is new, and it implies that the volume can be directly interpreted as a measure of model complexity. The volume is shown to be a continuous function of the design matrix $X$ at generic $X$, but to be discontinuous in general. This means that models with sparse design matrices can be significantly less complex than nearby models, so the resulting model-selection criterion prefers sparse models. This is analogous to the way that $\ell^1$-regularisation tends to prefer sparse model fits, though in our case this behaviour arises spontaneously from general principles. Lastly, an unusual topological duality is shown to exist between the ideal boundaries of the natural and expectation parameter spaces of logistic regression models.
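The volume referred to is ∫ sqrt(det I(θ)) dθ over the natural parameter space, where I(θ) is the logistic Fisher information. For a single covariate equal to 1 (n = q = 1) the stated bounds both equal π, so the volume is exactly π: substituting p = sigmoid(θ) turns the integral into ∫_0^1 dp / sqrt(p(1-p)) = π. A small numeric check of that case and of finiteness for larger n (a sketch, not the paper's code):

```python
import numpy as np
from scipy.integrate import quad

def fisher_volume_1d(x):
    """Volume \int sqrt(I(theta)) dtheta of a one-covariate logistic
    regression (q = 1, no intercept), where
    I(theta) = sum_i x_i^2 p_i (1 - p_i)  and  p_i = sigmoid(x_i * theta)."""
    def sqrt_info(theta):
        p = 1.0 / (1.0 + np.exp(-x * theta))
        return np.sqrt(np.sum(x ** 2 * p * (1 - p)))
    val, _ = quad(sqrt_info, -np.inf, np.inf)
    return val

# n = q = 1: both bounds equal pi, forcing the volume to be exactly pi.
print(fisher_volume_1d(np.array([1.0])))            # ~3.14159
# n = 3: still finite, and between pi and C(3,1) * pi per the stated bounds.
print(fisher_volume_1d(np.array([0.5, 1.0, 2.0])))
```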